Add OpenAI Responses API Support #5037
## Overview

This PR adds support for OpenAI's new **Responses API** to the `OpenAiApi` class, providing low-level access to OpenAI's latest agentic capabilities. The Responses API represents OpenAI's unified interface for building agent-like applications with built-in tools, multi-turn conversations, and enhanced reasoning capabilities.

**Important**: This PR adds support at the **low-level API layer only** (`OpenAiApi` class). It does not integrate with the high-level `ChatModel` abstractions. The Responses API appears to be a stateful, standalone application (OpenAI's latest agentic attempt) rather than a traditional chat model. It doesn't fit the existing `ChatModel` abstractions and isn't easily integrated as another chat-model provider. It represents a new agentic category entirely.

## Related Issues

- Closes spring-projects#4221 - Support for OpenAI Responses API
- Related to spring-projects#2962 - Enhanced reasoning model support
- Related to spring-projects#3022 - Multi-turn conversation handling

## Changes

### 1. Core API Support (`OpenAiApi.java`)

#### Added DTOs

**Request DTO - `ResponseRequest`**:
- Comprehensive request object with 24 parameters
- Parameters include: `model`, `input`, `instructions`, `temperature`, `tools`, `reasoning`, `conversation`, `previousResponseId`, etc.
- Supports all Responses API features: reasoning models, built-in tools, structured outputs, multi-turn conversations
- Includes nested records: `TextConfig`, `TextFormat`, `ReasoningConfig`

**Response DTO - `Response`**:
- Complete response structure with `id`, `status`, `model`, `output`, `usage`, etc.
- Nested records: `OutputItem`, `ContentItem`, `ReasoningDetails`, `ResponseError`, `IncompleteDetails`
- Supports multiple output types: messages, reasoning, tool calls

**Streaming DTO - `ResponseStreamEvent`**:
- Event-based streaming support
- Includes: `type`, `sequenceNumber`, `response`, `delta`, `text`, etc.
- Enables real-time processing of responses

#### Added Methods

- `responseEntity(ResponseRequest)` - Synchronous response creation
- `responseEntity(ResponseRequest, HttpHeaders)` - Synchronous with custom headers
- `responseStream(ResponseRequest)` - Streaming response creation
- `responseStream(ResponseRequest, HttpHeaders)` - Streaming with custom headers

#### Added Configuration

- `responsesPath` field (default: `/v1/responses`)
- Builder support for responses path configuration
- Updated constructors to include responses path

### 2. Autoconfiguration Support

#### `OpenAiChatProperties.java`

- Added `responsesPath` property with default value `/v1/responses`
- Added getter/setter methods
- Follows same pattern as `completionsPath` and `embeddingsPath`

#### `OpenAiChatAutoConfiguration.java`

- Updated `openAiApi()` bean to include `.responsesPath(chatProperties.getResponsesPath())`
- Enables Spring Boot property configuration

#### `OpenAiEmbeddingAutoConfiguration.java`

- Updated `openAiApi()` method to include responses path
- Uses default constant for consistency

### 3. Integration Tests (`OpenAiApiIT.java`)

Added 4 comprehensive integration tests:

1. **`responseEntity()`** - Basic synchronous response
   - Tests simple request/response flow
   - Validates response structure and content
   - Cost: ~10-20 tokens
2. **`responseStream()`** - Streaming responses
   - Tests event stream processing
   - Validates multiple event types
   - Cost: ~10-20 tokens
3. **`responseWithInstructionsAndConfiguration()`** - Advanced configuration
   - Tests system instructions and parameters
   - Validates parameter echo and content accuracy
   - Cost: ~10-20 tokens
4. **`responseWithWebSearchTool()`** - Built-in web_search tool
   - Demonstrates built-in tool usage (no custom implementation needed)
   - Tests tool execution and response handling
   - Validates output structure with tool calls
   - Cost: ~30-50 tokens

**Total estimated cost**: ~$0.0002 - $0.0005 per test run

### 4. Unit Tests (`ResponsesApiTest.java`)

Added comprehensive unit tests covering:

- `ResponseRequest` creation with various parameter combinations
- `Response` structure validation
- `ResponseStreamEvent` structure validation
- Convenience constructors

### 5. Documentation Updates

#### `openai-chat.adoc`

- Added `spring.ai.openai.chat.responses-path` property documentation
- Updated Chat Completions API references for clarity
- Changed note about Responses API availability (now supported via `OpenAiApi`)

## Key Features

### Built-in Tools

The Responses API provides tools without custom implementation:

- **`web_search`** - Search the internet (demonstrated in integration test)
- **`file_search`** - Search through uploaded files
- **`code_interpreter`** - Execute Python code
- **`computer_use`** - Interact with computer interfaces
- Remote MCPs (Model Context Protocol)

### Multi-turn Conversations

Native support for stateful conversations:

- Via `previousResponseId` parameter
- Via `conversation` object/ID

### Reasoning Models

Enhanced support for reasoning models (gpt-5, o-series):

- Configurable reasoning effort levels
- Access to reasoning content and summaries

### Structured Outputs

JSON schema validation via `TextConfig`:

- Type-safe structured responses
- Schema validation with `strict` mode

## Configuration

### Default Configuration (Minimal)

```yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
```

### Custom Configuration

```yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        responses-path: /v1/responses # Can be customized for compatible servers
```

## Usage Examples

### Basic Synchronous Request

```java
@Autowired
private OpenAiApi openAiApi;

public void example() {
    var request = new OpenAiApi.ResponseRequest("What is AI?", "gpt-4o");
    ResponseEntity<OpenAiApi.Response> response = openAiApi.responseEntity(request);

    // Extract text from response
    String text = response.getBody()
        .output()
        .stream()
        .filter(item -> "message".equals(item.type()))
        .flatMap(item -> item.content().stream())
        .filter(content -> "output_text".equals(content.type()))
        .map(OpenAiApi.Response.ContentItem::text)
        .findFirst()
        .orElse(null);
}
```

### Streaming Request

```java
var request = new OpenAiApi.ResponseRequest("Tell me a story", "gpt-4o", true);
Flux<OpenAiApi.ResponseStreamEvent> stream = openAiApi.responseStream(request);

stream.subscribe(event -> {
    if ("response.output_text.delta".equals(event.type())) {
        System.out.print(event.delta());
    }
});
```

### Using Built-in Web Search Tool

```java
var webSearchTool = Map.of("type", "web_search");

var request = new OpenAiApi.ResponseRequest(
    "gpt-4o",
    "What is the current weather in San Francisco?",
    null, null, null, null, null,
    List.of(webSearchTool), // Enable web_search tool
    null, null, false, null, null, null, null, null, null,
    List.of("web_search_call.action.sources"), // Include search sources
    null, null, null, null, null, null
);

ResponseEntity<OpenAiApi.Response> response = openAiApi.responseEntity(request);
```

### Multi-turn Conversation

```java
// First request
var request1 = new OpenAiApi.ResponseRequest("What is 2+2?", "gpt-4o");
var response1 = openAiApi.responseEntity(request1);
String responseId = response1.getBody().id();

// Follow-up request
var request2 = new OpenAiApi.ResponseRequest(
    "gpt-4o",
    "And what is that number multiplied by 3?",
    null, null, null, null, null, null, null, null, false, null, null, null,
    responseId, // Reference previous response
    null, null, null, null, null, null, null, null, null
);
var response2 = openAiApi.responseEntity(request2);
```

## Design Decisions

### Why Low-Level API Only?

The Responses API is fundamentally different from traditional chat models:

1. **Stateful vs Stateless**: The Responses API is designed for stateful, multi-turn agent applications, while `ChatModel` is stateless
2. **Built-in Tools**: Responses API provides native tools (web search, file search, etc.) without custom implementation, unlike `ChatModel`'s function calling
3. **Different Abstractions**: The output structure (`output` array with multiple item types) doesn't map cleanly to `ChatResponse`
4. **Agent-First Design**: Represents a new category of agentic applications rather than a traditional chat interface
5. **Future Evolution**: OpenAI is positioning this as the future of agent development, separate from chat completions

### Implementation Patterns

1. **Follows Existing Conventions**: Mirrors `chatCompletionEntity` and `chatCompletionStream` patterns
2. **Comprehensive DTOs**: All major API fields included for maximum flexibility
3. **Convenience Constructors**: Simplified constructors for common use cases
4. **Type Safety**: Uses Java records for immutable, type-safe DTOs
5. **Spring Boot Integration**: Full support for externalized configuration

## Backward Compatibility

✅ **Fully backward compatible**

- No changes to existing `ChatModel` implementations
- No changes to existing Chat Completions API usage
- New functionality is additive only
- Default values match OpenAI standards

## Testing

### Unit Tests

- ✅ 5 unit tests in `ResponsesApiTest`
- ✅ All existing tests continue to pass
- ✅ No compilation errors

### Integration Tests

- ✅ 4 new integration tests in `OpenAiApiIT`
- ✅ Cover synchronous, streaming, configuration, and built-in tools
- ✅ Minimal cost (~$0.0002-$0.0005 per run)
- ✅ Serve as usage examples

### Build Verification

- ✅ `spring-ai-openai` module builds successfully
- ✅ `spring-ai-autoconfigure-model-openai` module builds successfully
- ✅ All existing tests pass

## Benefits

1. **Early Access**: Enables developers to use OpenAI's latest agentic capabilities
2. **Built-in Tools**: Simplifies integration with web search, file search, etc.
3. **Future-Ready**: Positions Spring AI for OpenAI's agent-first direction
4. **Flexible**: Low-level API allows custom abstractions to be built on top
5. **Well-Documented**: Comprehensive tests serve as usage examples
6. **Cost-Efficient**: Integration tests designed to minimize API costs

## Future Enhancements

Potential future additions (not in this PR):

1. Higher-level abstractions if patterns emerge
2. Conversation management utilities
3. Response accumulator helpers for streaming
4. Observability support for Responses API calls
5. Integration with Spring AI's advisor pattern (if applicable)

## Migration from Chat Completions

For users wanting to try the Responses API:

| Chat Completions | Responses API |
|------------------|---------------|
| `messages` array | `input` (simplified) |
| Custom function implementation | Built-in tools (no code needed) |
| Manual conversation state | Native multi-turn support |
| Limited reasoning access | Full reasoning capabilities |
| `ChatCompletionRequest` | `ResponseRequest` |

## References

- [OpenAI Responses API Documentation](https://platform.openai.com/docs/api-reference/responses)
- [OpenAI Migration Guide](https://platform.openai.com/docs/guides/migrate-to-responses)
- [Responses vs Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions)
- [OpenAI Java SDK](https://github.com/openai/openai-java) - Referenced for implementation patterns

## Checklist

- [x] Code compiles without errors
- [x] All existing tests pass
- [x] New unit tests added and passing
- [x] New integration tests added and passing
- [x] Documentation updated
- [x] Autoconfiguration support added
- [x] Spring Boot properties supported
- [x] Backward compatible
- [x] Follows existing code conventions
- [x] No breaking changes

---

**Note**: This PR intentionally does **not** integrate the Responses API with the high-level `ChatModel` abstractions. The Responses API represents a fundamentally different paradigm (stateful agents vs stateless chat) that doesn't fit the existing abstractions. This low-level API access allows the Spring AI community to experiment and potentially develop appropriate higher-level abstractions in the future.

Signed-off-by: Dmitry Bedrin <dmitry.bedrin@gmail.com>
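The text-extraction pattern from the Basic Synchronous Request example can be isolated into a small null-safe helper. The sketch below is only an illustration, not code from this PR: `ContentItem`, `OutputItem`, and `Response` are hypothetical stand-in records that model just the fields used here, `firstOutputText` is a made-up name, and the assumption that non-message items may carry no content list is mine.

```java
import java.util.List;
import java.util.Optional;

public class ResponseTextSketch {

    // Hypothetical stand-ins for OpenAiApi.Response and its nested records;
    // only the fields needed by the extraction logic are modeled.
    record ContentItem(String type, String text) {}
    record OutputItem(String type, List<ContentItem> content) {}
    record Response(String id, List<OutputItem> output) {}

    // Returns the first "output_text" found in any "message" item, if present.
    static Optional<String> firstOutputText(Response response) {
        if (response == null || response.output() == null) {
            return Optional.empty();
        }
        return response.output().stream()
            .filter(item -> "message".equals(item.type()))
            // Assumption: some items may have no content list, so skip nulls.
            .filter(item -> item.content() != null)
            .flatMap(item -> item.content().stream())
            .filter(content -> "output_text".equals(content.type()))
            .map(ContentItem::text)
            .filter(text -> text != null)
            .findFirst();
    }

    public static void main(String[] args) {
        Response response = new Response("resp_123", List.of(
            new OutputItem("reasoning", null),
            new OutputItem("message", List.of(new ContentItem("output_text", "4")))));
        System.out.println(firstOutputText(response).orElse("<no text>"));
    }
}
```

Returning an `Optional` instead of `null` lets callers decide how to handle a response that contains only reasoning or tool-call items.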
Why are you saying this, @bedrin? According to https://platform.openai.com/docs/guides/migrate-to-responses, OpenAI themselves recommend that new projects use the new Responses API.
Some models, e.g. gpt-5.1-codex, do not support the Chat Completions API and only support the Responses API. I believe that there can be a This PR is a step in a good direction. However, I do believe that Spring AI itself should support the Responses API through the
@filiphr I meant that we cannot use all the functionality provided by the new Responses API in the existing Spring AI ChatModel abstraction. But indeed we can still use it; you're right. I think it should be a separate PR, though.